Abstract

Different criteria (Shannon's entropy, Bayes' average cost, Dürr's normalized rms spread)
have been introduced to measure the "which-way" information present in interference
experiments where, due to non-orthogonality of the detector states, the path determination
is incomplete. For each of these criteria, we determine the optimal measurement to be
carried out on the detectors, in order to read out the maximum which-way information. We show
that, while in two-beam experiments the optimal measurement is always provided by an
observable involving the detector only, in multibeam experiments with equally populated
beams and two-state detectors this is the case only for the Dürr criterion, as the other two
require the introduction of an ancillary quantum system as part of the read-out apparatus.



1 Introduction 

The debate on double-slit interference experiments, with photons or matter particles, and on
the possibility of detecting, as proposed by Einstein, "which way" individual particles are
taking, helped to shape the basic concept of complementarity in quantum mechanics. According
to this early discussion, Young interference experiments showed the wave nature of both
radiation and matter, and any attempt to exhibit their complementary particle nature, by
detecting which path each individual quantum was travelling, was regarded as implying
a disturbance capable of destroying the interference pattern.* It was, however, much later
noticed that "in Einstein's version of the double-slit experiment, one can retain a surprisingly
strong interference pattern by not insisting on a 100% reliable determination of the slit through
which each photon passes" [2].

*For an analysis of the Bohr-Einstein dialogue and reprints of the relevant papers see [1].

More recently this problem has been thoroughly investigated from both a theoretical and an
experimental point of view, by proposing gedanken-experiments, or actually performing them,
in which the quantum unitary evolution of both the system and the detector is completely
under control. In many cases care is taken to have the detectors act on internal degrees
of freedom, so that they do not directly disturb the centre-of-mass motion.

As is well known, the partial loss of contrast of the interference fringes, their modification
or total disappearance, finds a complete quantum mechanical description in terms of the
entanglement between the interfering particles and the detectors. To be more precise, the unitary
evolution describing the interaction of the system with the detectors leads to the entangled
state

$$ |\Psi(t)\rangle = |\psi_1(t)\rangle \otimes |\chi_1\rangle + |\psi_2(t)\rangle \otimes |\chi_2\rangle , \qquad (1.1) $$

where $|\psi_i(t)\rangle$, $i = 1, 2$, denote the states of the beams going through slits 1 and 2, respectively,
while $|\chi_i\rangle$, $i = 1, 2$, are the (normalized) detector final states, and $t$ is any time after the
system has left the detection region. The structure of the interference fringes may be read off
the probability density on the screen:

$$ |\langle x|\Psi(t_1)\rangle|^2 = |\langle x|\psi_1(t_1)\rangle|^2 + |\langle x|\psi_2(t_1)\rangle|^2 + 2\,\mathrm{Re}\{\langle\psi_1(t_1)|x\rangle \langle x|\psi_2(t_1)\rangle \langle\chi_1|\chi_2\rangle\} . \qquad (1.2) $$
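As a numerical illustration of Eq. (1.2), the following minimal Python sketch evaluates the
screen intensity for two beams of equal amplitude and reads off the fringe contrast as a
function of the detector overlap. The plane-wave model for the two beams, the wave number
`k` and the function names are assumptions made for illustration only; they are not part of
the original analysis.

```python
import cmath
import math

def intensity(x, overlap, k=1.0):
    """Screen intensity of Eq. (1.2) for two equal-amplitude plane-wave beams.

    Assumed toy model: psi_1(x) = exp(+i k x)/sqrt(2), psi_2(x) = exp(-i k x)/sqrt(2).
    `overlap` is the (possibly complex) detector overlap <chi_1|chi_2>.
    """
    psi1 = cmath.exp(1j * k * x) / math.sqrt(2)
    psi2 = cmath.exp(-1j * k * x) / math.sqrt(2)
    cross = psi1.conjugate() * psi2 * overlap
    return abs(psi1) ** 2 + abs(psi2) ** 2 + 2 * cross.real

def fringe_contrast(overlap, samples=2000):
    """(I_max - I_min)/(I_max + I_min), sampled over one fringe period (k = 1)."""
    xs = [math.pi * j / samples for j in range(samples + 1)]
    values = [intensity(x, overlap) for x in xs]
    return (max(values) - min(values)) / (max(values) + min(values))

for s in (1.0, 0.5, 0.0):
    print(f"<chi_1|chi_2> = {s}: contrast = {fringe_contrast(s):.3f}")
```

In this toy model the contrast equals the modulus of the overlap, so it interpolates between
full contrast (identical detector states) and no fringes (orthogonal detector states).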

Depending on the value of $\langle\chi_1|\chi_2\rangle$ there is a continuum between the extreme cases of
no which-way detection ($|\chi_1\rangle = |\chi_2\rangle$), where the wave nature is exhibited by interference
fringes with maximum contrast, and perfect which-way detection ($\langle\chi_1|\chi_2\rangle = 0$), where the
interference fringes disappear. For example, in the experimental realization [?] of Feynman's
gedanken-experiment [?], the states $|\chi_i\rangle$ describe the scattered photon needed to detect
whether the atom (rather than the electron, as in the original discussion) has passed through
slit 1 or 2, and the quantity $\langle\chi_1|\chi_2\rangle$ can be varied by changing the spatial separation
between the interfering paths at the point of scattering. In the experimental setup proposed
in [?], the which-way detection is performed by micro-maser cavities inserted on the beams of
previously excited atoms. Atomic decay in one of the cavities provides which-way information
whose predictability depends on the initial state of the cavities. However, we should point
out that the detector need not be a separate physical system: the which-way information
may indeed be stored in some internal degrees of freedom of the interfering particles, as
happens in neutron interference experiments [?], where the spin of the neutron in one of the
beams is rotated with respect to the original common direction. Notice that, in each of these
examples, the structure of the interference fringes, as is clear from Eq. (1.2), depends on the
entanglement of the system with the apparatus, from which "which-way" information may
eventually be recovered by means of an appropriate measurement, and not on whether that
measurement is actually performed. Eq. (1.1) describes only a premeasurement. Therefore the actual measurement
relative to the "which-way" information may be arbitrarily delayed. As Schrödinger puts it,
in his "general confession" motivated by the appearance of the Einstein, Podolsky, Rosen
paper [?], "entanglement of predictions" goes "back to the fact that the two bodies at some
earlier time formed in a true sense one system, that is were interacting, and have left behind
traces on each other".

Furthermore, it should be stressed that, apart from the extreme case in which
$\langle\chi_1|\chi_2\rangle = 0$, no measurement can provide full information on the way that an individual
quantum has taken. One is actually dealing with a problem in quantum detection theory, that is, in statistical
decision theory. In order to decide what measurement should be carried out to extract the
best possible which-way information, it is necessary to spell out a strategy in which an a priori
evaluation criterion is given.

In the pioneering work of Wootters and Zurek, Shannon's definition of information
entropy $H$ was taken as a quantitative measure of the gain in "which-way" information obtained
by actually performing a measurement on the detector state. In this framework evidence was
produced that "the more clearly we wish to observe the wave nature ... the more information
we must give up about its particle properties". Following this suggestion, Englert [10], by
using a different criterion for evaluating the available information, was able to establish, for
equally populated beams, a complementarity relationship between the distinguishability $\mathcal{D}$, which
quantifies how well the ways can be told apart, and the visibility $\mathcal{V}$, which measures the quality of the
interference fringes:

$$ \mathcal{D}^2 + \mathcal{V}^2 \le 1 , \qquad (1.3) $$

with the equality sign holding if the detector is prepared in a pure state. As usual, $\mathcal{V}$ is defined
in terms of the maximum and minimum intensities of the fringes ($I_M$ and $I_m$),
$\mathcal{V} = (I_M - I_m)/(I_M + I_m)$, while $\mathcal{D}$ is simply related to the optimum average Bayes cost $C_{\rm opt}$, traditionally
used in decision theory, by the relation $\mathcal{D} = 1 - 2\,C_{\rm opt}$.†
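The relation between distinguishability and Bayes cost can be checked numerically in the
simplest setting. The sketch below treats two pure detector states with equal populations and
assumes Helstrom's standard closed form for the minimum error probability, which is quoted
here rather than derived in this paper; the function names are illustrative.

```python
import math

def optimum_bayes_cost(overlap):
    """Minimum average error probability for two equiprobable pure states.

    Assumption: Helstrom's closed form C_opt = (1 - sqrt(1 - |<chi_1|chi_2>|^2)) / 2.
    """
    return 0.5 * (1.0 - math.sqrt(1.0 - abs(overlap) ** 2))

def distinguishability(overlap):
    """D = 1 - 2 C_opt, the relation quoted after Eq. (1.3)."""
    return 1.0 - 2.0 * optimum_bayes_cost(overlap)

def visibility(overlap):
    """Fringe visibility for equally populated beams, read off Eq. (1.2)."""
    return abs(overlap)

for s in (0.0, 0.3, 0.8, 1.0):
    d, v = distinguishability(s), visibility(s)
    print(f"|<chi_1|chi_2>| = {s}: D = {d:.4f}, V = {v:.4f}, D^2 + V^2 = {d * d + v * v:.4f}")
```

Under these assumptions the bound of Eq. (1.3) is saturated for every overlap, as expected
for a detector prepared in a pure state.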

New problems arise in going from the case of two beams to a multibeam interference
process. As shown by Dürr [11], the complementarity relationship [Eq. (1.3)] still holds when
the visibility is taken to be the properly normalized deviation of the fringe intensity from
its mean value, and the distinguishability, following an alternative
notion of entropy introduced in Ref. [12], is taken to be the maximum average rms spread of the a
posteriori probabilities for the different paths (see Sec. 2).

The purpose of this paper is to examine an interesting physical aspect of the problem
that seems to have been overlooked so far: once a specific criterion to
measure the which-way information is chosen, what is the actual measurement that has to be
performed on the detectors in order to extract the optimum information? The usual approach
to this question is to consider the set $\mathcal{A}_D$ of all observables $A$ relative to the detector,
and to search, among them, for the observable that delivers most information. However, it is
known from quantum detection theory [13, 14] that the amount of information that can be
obtained in this way does not represent, in general, the absolute maximum. Sometimes it is
possible to do better by introducing, in addition to the detector, an ancilla, namely an
auxiliary quantum system, neither interacting with the detector nor having any correlation
with it. Despite the fact that the detector and the ancilla are, in all respects, independent
systems, it may happen that a larger amount of information can be obtained by measuring an
observable relative to the combined system. In connection with this issue, we point out that,
even if the quantity $\mathcal{D}$ appearing in Eq. (1.3) is usually defined in relation to $\mathcal{A}_D$, the proofs
leading to Eq. (1.3), say in Refs. [10], [11], remain valid if one includes the observables for the
system formed by the detector and the ancilla together. It follows that the quantity $\mathcal{D}$ really
refers to all possible detector+ancilla systems.

†In Ref. [?] the distinguishability is expressed in terms of the optimum likelihood $\mathcal{L}_{\rm opt}$ for "guessing the
way right". This optimum likelihood is one minus the optimum average Bayes cost $C_{\rm opt}$.

Since the need for an ancilla seems to us a source of undesirable complication for the
read-out apparatus, it would be interesting to know under what circumstances the ancilla
is really required. In particular, it would be interesting to know whether there exist criteria to
measure the which-way information such that the optimal measurement turns out to be an
ordinary observable relative to the detector, and the inclusion of an ancilla does not lead to
any improvement. We show that, in the case of two-beam interference experiments, with
either one of the two proposed measures of information, the optimal measurement does not
involve an ancilla. On the contrary, in the case of multibeam experiments, it is only with the
criterion introduced in Ref. [11] that the ancilla is unnecessary, while it is required, in general, for the
other two criteria. It is interesting to notice that the criterion for which ordinary
measurements are good enough is the one that leads to the complementarity relation given by
Eq. (1.3). Finally, let us notice that, while inspired by the problem of complementarity in
interference experiments, our work is a contribution to the difficult problem of optimization in
quantum decision theory.

The paper is organized as follows. In Sec. 2 the quantum detection problem for detector states
that are not mutually orthogonal is presented and the notion of ancilla is introduced. We
review a fundamental theorem by Neumark, stating that measurements involving an ancilla, in
the enlarged detector-ancilla Hilbert space, can be equivalently described by means of positive
operator-valued measures (POVM) on the detector's Hilbert space, generalizing the ordinary
projection-valued measures (PVM) that describe measurements not involving the ancilla. We
then list the conditions that must be satisfied by any function for it to be a good measure of
the amount of information provided by a POVM. The different choices present in the literature
for such a function are considered, and the resulting optimization problems are studied in Sec.
3 for the case of two beams, and in Sec. 4 for multibeam interferometers. Some of the proofs
are postponed to an Appendix. Final remarks and a discussion of perspectives close the paper.

2 The quantum decision problem.



We consider an n-beam interference experiment: a single beam of identical microscopic systems,
like photons, electrons, neutrons, atoms etc. (generically referred to as particles), is divided
into n spatially separated beams by some sort of beam-splitter, like a screen with n slits. The
n beams are then recombined on a screen, and the interference figure is observed. It is assumed
that the intensity of the beam is adjusted so that only one particle at a time passes through
the interferometer, and that the populations $\zeta_i$ of each of the n beams can be adjusted at will.
We imagine now that a detector, designed to provide which-way information on individual
particles passing through the interferometer, is placed along the trajectories of the beams. It
is assumed that the detector also can be treated as a quantum system, and that the
system-detector interaction gives rise to some unitary process. The detector will serve as a which-way
detector if, once prepared in some fixed state $|\chi_0\rangle$, it is brought by the interaction with the
particles into a new state that depends on the beam occupied by the particle. In formulae,
this amounts to requiring that, after the interaction, the state of the particle-detector system
is the following entangled state, generalizing Eq. (1.1):

$$ \sum_{i=1}^{n} c_i\, |\psi_i\rangle \otimes |\chi_i\rangle . \qquad (2.1) $$

Here, $|\psi_i\rangle$ denote the normalized particle wave-functions for the individual beams, while
$|\chi_i\rangle$ are n normalized (but not necessarily orthogonal!) states of the which-way detectors.
We define the detector's Hilbert space $\mathcal{H}_D$ as the linear span of the states $|\chi_i\rangle$:

$$ \mathcal{H}_D := \mathrm{span}\{ |\chi_i\rangle ,\ i = 1, \dots, n \} . \qquad (2.2) $$

(Of course, it may very well happen that the set of all possible states of the detector, as a
physical system, is actually larger than $\mathcal{H}_D$.) In concrete experiments $|\chi_i\rangle$ may in fact be
internal states of the particles themselves, in which case $|\psi_i\rangle$ denotes the space part of the
particle's wave-function. We assume that the amplitudes $c_i$ are known in advance, such that
the weights $\zeta_i = |c_i|^2$ give the a priori probabilities for a particle to pass through the i-th slit.
The state [Eq. (2.1)] describes a situation in which there is complete correlation between the
beams and the internal states of the detector, such that, if the detector is found to be in the
state $|\chi_i\rangle$, one can tell with certainty that the particle passed through the i-th slit. Thus the
problem of determining the trajectory of the particle reduces to the following one: after the
passage of each particle, is there a way to decide in which of the n states $|\chi_i\rangle$ the detector
was left? If the states $|\chi_i\rangle$ are orthogonal to each other, the answer is obviously yes. Indeed,
if we let $\mathcal{A}_D$ be the set of all hermitean operators in $\mathcal{H}_D$, we can surely find in $\mathcal{A}_D$ an observable
$A$ such that:

$$ A\,|\chi_i\rangle = \lambda_i\, |\chi_i\rangle , \qquad \lambda_i \neq \lambda_j \ \ \text{for}\ i \neq j . \qquad (2.3) $$

If $A$ is measured, and the result $\lambda_i$ is found, one can infer with certainty that the detector was
in the state $|\chi_i\rangle$. If, however, the states $|\chi_i\rangle$ are not orthogonal to each other, for no choice
of $A$ can one fulfil Eq. (2.3): whichever $A$ one picks, there will be at least one eigenvector
of $A$ having a non-zero projection onto more than one state $|\chi_i\rangle$. Therefore, when the
corresponding eigenvalue is obtained as the result of a measurement, no unique detector state
can be inferred, and only probabilistic judgments can be made. Under such circumstances,
the best one can do is to select the observable that provides as much information as possible,
on the average, namely after many repetitions of the experiment. Of course, this presupposes
the choice of a definite criterion to measure the average amount $F(A)$ of which-way information
delivered by a certain observable $A$ (the properties of $F(A)$, and the various choices proposed
so far for this quantity, are discussed later in this Section). After this choice is made, the
distinguishability $\mathcal{D}$ of the trajectories is usually related to the supremum, $F_D$, of $F(A)$ over
$\mathcal{A}_D$.

It may now come as a surprise to notice, as pointed out in the Introduction, that the
quantity $F_D$ does not always represent the absolute maximum information that is actually available.
Indeed, it is an intriguing feature of the quantum detection problem, for non-orthogonal states,
that a larger amount of information on the state of the detector can be obtained by considering
the detector in combination with an auxiliary quantum system, called ancilla [14]. The
ancilla does not interact with the detector, and is prepared in a fixed known state $|\phi_0\rangle \in \mathcal{H}_{\rm aux}$,
such that the combined system is in one of the n uncorrelated states $|\chi_i\rangle \otimes |\phi_0\rangle$, belonging
to the total Hilbert space $\mathcal{H}_{\rm tot} = \mathcal{H}_D \otimes \mathcal{H}_{\rm aux}$. Let now $\mathcal{A}_{\rm tot}$ be the set of all hermitean operators
in $\mathcal{H}_{\rm tot}$ and $F_{\rm tot}$ the supremum of $F(A)$ over $\mathcal{A}_{\rm tot}$. Surprisingly enough, even if the detector
and the ancilla are uncorrelated, it may happen that $F_{\rm tot} > F_D$, showing that the inclusion of
an ancilla may improve the amount of which-way information that can be read out from the
detectors.

Since the state of the ancilla is fixed once and for all, it is possible, though, to express the
probabilities of the possible outcomes resulting from the measurement of any observable $A_{\rm tot}$
in $\mathcal{H}_{\rm tot}$ in terms of quantities defined directly in $\mathcal{H}_D$. Let $P_\mu$, $\mu = 1, \dots, N$, be the orthogonal
decomposition of the identity in $\mathcal{H}_{\rm tot}$ relative to $A_{\rm tot}$ (we consider for simplicity an observable
with a finite number $N$ of distinct outcomes). Then the probability $P_{i\mu}$ that the outcome $\mu$
is observed in the state $|\chi_i\rangle \otimes |\phi_0\rangle$ is given by the well-known formula:

$$ P_{i\mu} = \mathrm{Tr}[P_\mu (\rho_i \otimes \rho_{\rm aux})] , \qquad (2.4) $$

where $\rho_i = |\chi_i\rangle\langle\chi_i|$ and $\rho_{\rm aux} = |\phi_0\rangle\langle\phi_0|$. If the trace is performed in two steps, first on
the ancillary Hilbert space and then on $\mathcal{H}_D$, we can rewrite the above expression as

$$ P_{i\mu} = \mathrm{Tr}[A_\mu \rho_i] , \qquad (2.5) $$

where

$$ A_\mu = \mathrm{Tr}_{\rm aux}[P_\mu (1 \otimes \rho_{\rm aux})] , \qquad (2.6) $$

and $\mathrm{Tr}_{\rm aux}$ denotes the partial trace over the ancilla Hilbert space. The hermitean operators
$A_\mu$ belong to $\mathcal{A}_D$, and it is easy to check that they are positive semidefinite, and that they provide
a decomposition of the identity on $\mathcal{H}_D$:

$$ \sum_\mu A_\mu = 1 \quad \text{on } \mathcal{H}_D . \qquad (2.7) $$

However, in general, they are not projection operators, nor do they commute with each other.
We point out also that the number $N$ of different outcomes need not be the same as either
the number n of detector states or the dimensionality of $\mathcal{H}_D$. The collection $\{A_\mu\}$ of
operators constitutes an example of a positive operator-valued measure (POVM) in $\mathcal{H}_D$. More
generally [13, 14], a POVM is a map that associates to every (Borel) subset $\Delta$ of the real line
$\mathbb{R}$ a non-negative (self-adjoint) operator $\Pi(\Delta)$, such that:

i) the empty set is mapped to zero;

ii) the entire real line is mapped to the identity operator;

iii) the union of any number of disjoint sets is mapped to the sum of the corresponding
operators.

The probability $P(\Delta)$ for the outcome to be in the set $\Delta$ is given by the following expression,
generalizing Eq. (2.5):

$$ P(\Delta) = \mathrm{Tr}[\rho\, \Pi(\Delta)] . \qquad (2.8) $$

The axioms i), ii) and iii) listed above ensure the consistency of the above probabilistic
interpretation. POVM's thus represent a generalization of the projection-valued measures (PVM)
usually considered in Quantum Mechanics, and it is a theorem due to Neumark [?] that
all POVM's on $\mathcal{H}_D$ can be realized by means of an appropriate ancillary system, in the way
sketched above. Since any quantum system not interacting with the detector can play the role
of the ancilla, this theorem implies that every POVM can be realized by an experimental
procedure falling within the usual framework of Quantum Mechanics. Thus, in order to determine
the maximum amount of which-way information that can be obtained by observing the
detector, we should maximize $F$ over the set of all POVM's on $\mathcal{H}_D$, and not just over the set
of all PVM's.
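The passage from a projective measurement on the detector-ancilla system to a POVM on the
detector alone, Eqs. (2.6) and (2.7), can be made concrete with a small numerical sketch. The
setting below is an assumed toy example, not one taken from the paper: a two-state detector,
a two-state ancilla prepared in its first basis state, and a measurement in the Bell basis of
the combined system. All names are illustrative.

```python
import math

# Assumed toy setting: detector qubit and ancilla qubit, ancilla prepared in |0>.
# A vector of the combined space is stored as phi[d][a] (d: detector, a: ancilla).
R = 1.0 / math.sqrt(2.0)
BELL_BASIS = [
    [[R, 0.0], [0.0, R]],    # (|00> + |11>)/sqrt(2)
    [[R, 0.0], [0.0, -R]],   # (|00> - |11>)/sqrt(2)
    [[0.0, R], [R, 0.0]],    # (|01> + |10>)/sqrt(2)
    [[0.0, R], [-R, 0.0]],   # (|01> - |10>)/sqrt(2)
]
ANCILLA = [1.0, 0.0]  # |phi_0> = |0>

def povm_element(phi, ancilla):
    """A_mu = Tr_aux[P_mu (1 x rho_aux)] for the rank-one projector P_mu = |phi><phi|.

    For real amplitudes this reduces to |v><v| with v = <phi_0|phi>, a detector vector.
    """
    v = [sum(phi[d][a] * ancilla[a] for a in range(2)) for d in range(2)]
    return [[v[r] * v[c] for c in range(2)] for r in range(2)]

def mat_add(x, y):
    return [[x[r][c] + y[r][c] for c in range(2)] for r in range(2)]

def mat_mul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

elements = [povm_element(phi, ANCILLA) for phi in BELL_BASIS]
total = [[0.0, 0.0], [0.0, 0.0]]
for a_mu in elements:
    total = mat_add(total, a_mu)

print("sum of A_mu:", total)           # compare with the identity, Eq. (2.7)
print("A_1:", elements[0])
print("A_1 squared:", mat_mul(elements[0], elements[0]))  # compare with A_1
```

In this example the four operators add up to the identity on the detector space, yet each of
them is one half of a projector, so the collection is a POVM that is not a PVM.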

It is time now to define precisely the average which-way information $F$ delivered by a POVM.
For any POVM $\{A_\mu,\ \mu = 1, \dots, N\}$ (we shall always consider POVM's with a finite number $N$
of different outcomes, in what follows), consider the a posteriori probabilities $Q_{i\mu}$ that the
detector is in the state $|\chi_i\rangle$, given that the $\mu$-th outcome has been observed. According to Bayes' formula:

$$ Q_{i\mu} = \frac{\zeta_i P_{i\mu}}{q_\mu} , \qquad (2.9) $$

where $q_\mu$ is the a priori probability for the occurrence of the outcome $\mu$:

$$ q_\mu = \sum_i \zeta_i P_{i\mu} . \qquad (2.10) $$
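Bayes' formula above can be sketched in a few lines of Python. The function name and the
numerical values are hypothetical and serve only to show how the a priori outcome
probabilities and the a posteriori path probabilities are obtained from the populations and
from the outcome probabilities for each detector state.

```python
def posteriors(priors, likelihoods):
    """Return (q, Q) with q[mu] from Eq. (2.10) and Q[i][mu] from Eq. (2.9).

    priors[i]          -- a priori path probabilities zeta_i
    likelihoods[i][mu] -- P_{i mu}, probability of outcome mu given detector state i
    """
    n, n_out = len(priors), len(likelihoods[0])
    q = [sum(priors[i] * likelihoods[i][mu] for i in range(n)) for mu in range(n_out)]
    # Outcomes with q[mu] == 0 never occur; their posteriors are left at zero.
    Q = [[priors[i] * likelihoods[i][mu] / q[mu] if q[mu] > 0 else 0.0
          for mu in range(n_out)] for i in range(n)]
    return q, Q

# Hypothetical numbers: two equally populated beams, a two-outcome measurement.
zeta = [0.5, 0.5]
P = [[0.9, 0.1],
     [0.2, 0.8]]
q, Q = posteriors(zeta, P)
print("q =", q)
print("Q =", Q)
```

For every outcome with non-zero probability the posteriors over the paths sum to one, which
is a quick consistency check on any table of outcome probabilities.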

In order to measure the amount of which-way information that is gained if the $\mu$-th outcome
is observed, we consider the quantity $F_\mu = F(\vec{Q}_\mu)$, where $\vec{Q}_\mu = (Q_{1\mu}, \dots, Q_{n\mu})$ and $F$ is some
function. It is reasonable to require from $F$ the following properties:

(1) $F$ should be invariant under any permutation of its n arguments;

(2) $F$ should reach its absolute minimum when its n arguments are all equal to 1/n (which
corresponds to complete lack of information on the detector state);

(3) $F$ should reach its absolute maximum when any of its arguments is equal to one, while all
the others are equal to zero (which on the contrary corresponds to certain knowledge of the
detector state);

(4) $F$ should be convex, i.e. for any $\lambda \in [0, 1]$ it should hold:*

$$ F(\lambda Q' + (1-\lambda) Q'') \le \lambda F(Q') + (1-\lambda) F(Q'') . \qquad (2.11) $$

The intuitive meaning of this condition is clear if we interpret $Q'$ and $Q''$ as giving the a
posteriori probabilities of n alternative hypotheses, for two distinct tests $A'$ and $A''$. For any
$\lambda \in [0, 1]$, we can consider the combination $A_\lambda$ of the tests $A'$ and $A''$, which consists in
performing randomly either $A'$ or $A''$, with relative probabilities $\lambda$ and $1 - \lambda$, respectively.
Equation (2.11) then states that the test $A_\lambda$ cannot carry more information than the weighted
sum of the information obtained from $A'$ and $A''$ separately.

The overall average information delivered by the POVM is defined as the average $F$ of the
numbers $F_\mu$ over all possible outcomes, weighted with the a priori probabilities $q_\mu$:

$$ F = \sum_{\mu=1}^{N} q_\mu F_\mu . \qquad (2.12) $$



The optimization problem consists in searching for the POVM which maximizes $F$. Notice
that, among the unknowns, we have to consider also the number $N$ of elements of the POVM.
Of course, the solution depends on the choice of the function $F$ above. Over the past years,
several different choices have been adopted. For example, as we said in the Introduction, the
authors of Refs. [2, 14, 15] consider the negative of Shannon's entropy [?] $H$, which corresponds
to taking:

$$ F_\mu = -H_\mu := \sum_i Q_{i\mu} \log Q_{i\mu} . \qquad (2.13) $$

References [10, 13] use the negative of Bayes' cost function $C$:

$$ F_\mu = -C_\mu := Q_{j(\mu)\mu} - 1 , \qquad (2.14) $$

where, for each $\mu$, $j(\mu)$ is any index such that $Q_{j(\mu)\mu} = \mathrm{Max}\{Q_{1\mu}, \dots, Q_{n\mu}\}$. Finally, more
recently, Dürr [11] considered the normalized rms spread $K$:

$$ F_\mu = K_\mu := \left[ \frac{n}{n-1} \sum_{i=1}^{n} \left( Q_{i\mu} - \frac{1}{n} \right)^2 \right]^{1/2} . \qquad (2.15) $$

*$F$ is said to be strictly convex if the equality sign in Eq. (2.11) holds if and only if the vectors $Q'$ and $Q''$
coincide.

When n = 2, it is easy to check that $K_\mu = 1 - 2\,C_\mu$, and thus the two criteria (2.14) and (2.15)
are inequivalent only for more than two beams. Notice also that, while Shannon's entropy and
the rms spread are strictly convex, the Bayes cost function is only convex.
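The three criteria can be compared directly on a vector of a posteriori probabilities. The
following sketch is illustrative: the explicit forms used for the Bayes cost (one minus the
largest posterior) and for the normalized rms spread are assumptions consistent with the
definitions given in the text, the logarithm is taken to be natural, and the function names
are not from the paper.

```python
import math

def neg_shannon(Q):
    """Eq. (2.13): F = sum_i Q_i log Q_i (natural logarithm, 0 log 0 := 0)."""
    return sum(p * math.log(p) for p in Q if p > 0)

def neg_bayes_cost(Q):
    """Eq. (2.14): F = -C with C = 1 - max_i Q_i."""
    return max(Q) - 1.0

def rms_spread(Q):
    """Eq. (2.15): K = sqrt( n/(n-1) * sum_i (Q_i - 1/n)^2 )."""
    n = len(Q)
    return math.sqrt(n / (n - 1) * sum((p - 1.0 / n) ** 2 for p in Q))

# Two beams: the rms spread and the Bayes cost carry the same information.
for Q in ([0.5, 0.5], [0.8, 0.2], [1.0, 0.0]):
    cost = -neg_bayes_cost(Q)
    print(Q, "K =", round(rms_spread(Q), 6), "1 - 2C =", round(1 - 2 * cost, 6))

# Three beams: the two criteria no longer coincide.
Q3 = [0.5, 0.5, 0.0]
print(Q3, "K =", round(rms_spread(Q3), 6), "1 - 2C =", round(1 + 2 * neg_bayes_cost(Q3), 6))
```

The last line shows a three-beam posterior for which the rms spread is non-zero while
$1 - 2C$ vanishes, in line with the remark that the two criteria differ only for n > 2.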

Solving the optimization problem is a difficult task, and so far no general solution is known.
However, partial results are available. For POVM's consisting of a finite number of elements,
by using the convexity of the function $F$, it is easy to show [15] that the optimal POVM can
be chosen to consist of rank-one operators, namely:

$$ A_\mu = |\phi_\mu\rangle\langle\phi_\mu| , \qquad (2.16) $$

where $\langle\phi_\mu|\phi_\mu\rangle \le 1$. Moreover, if $\mathcal{H}_D$ is finite dimensional and $d$ is its dimension, it has been
shown [15] that the number $N$ of elements of the optimal POVM can be taken to satisfy:

$$ d \le N \le d^2 . \qquad (2.17) $$



3 Two-beam interferometers. 

In this short Section, we consider a two-beam interferometer. For such a system, as pointed out
in the previous Section, the criterion using the Bayes cost function [Eq. (2.14)] turns out to
be equivalent to that based on the rms spread [Eq. (2.15)]. The quantum detection problem,
with the Bayes cost function as measure of information, is studied at length in Ref. [13]. There,
it is shown that, for any number n of linearly independent states $|\chi_i\rangle$ and arbitrary a priori
probabilities $\zeta_i$, the optimal measurement is always a PVM. Since, in two-beam interferometers,
the detector states $|\chi_1\rangle$ and $|\chi_2\rangle$ must be distinct for any path discrimination to be possible,
they are necessarily linearly independent, and thus it follows from the quoted result that the
optimal measurement is a PVM.

To our knowledge, there is no published proof that the optimal measurement is also a PVM
when one uses Shannon's entropy as a measure of the which-way information. We have proven
it in the special case of equally populated beams, $\zeta_1 = \zeta_2 = 1/2$. The rather elaborate proof can
be found in the Appendix. When the populations $\zeta_i$ are different, we have not been able to
work out an analytical proof, but a number of numerical simulations, performed for various
choices of the populations, seem to indicate that the optimal measurement is a PVM also in
this general case.

In conclusion, it appears that for two-beam interferometers, with either Bayes' cost or
Shannon's information as measure of which-way information, ordinary PVM's can read out the
maximum which-way information from the detectors, and recourse to ancillas is superfluous.
In fact, it turns out that the optimal PVM is the same for both criteria (see Eq. (7.12) in the
Appendix).

4 Multi-beam interferometers.



In this Section we study the case of multi-beam interferometers, with n > 2 beams. We
make the simplifying assumption that $\mathcal{H}_D$ is two-dimensional. This case is actually realized in
experiments using beams of spin-half particles or photons, if the path information is stored in
the internal states of the interfering particles. A further simplifying assumption that we make
is that the beams are equally populated: $\zeta_i = 1/n$.

$\mathcal{H}_D$ is isomorphic to $\mathbb{C}^2$, the set of all pairs of complex numbers. As is well known, rays of
$\mathbb{C}^2$ can be put in one-to-one correspondence with unit three-vectors $\hat{n} = (n_x, n_y, n_z)$, via the
map:

$$ \frac{1 + \hat{n}\cdot\vec{\sigma}}{2}\, |\chi\rangle = |\chi\rangle , \qquad (4.1) $$

where $\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ is a set of Pauli matrices. Thus, assigning n pure states $|\chi_i\rangle$ amounts
to picking n unit vectors $\hat{n}_i$ in $\mathbb{R}^3$. Whether the optimal test is a PVM or rather a POVM
now depends on the choice of the function $F$. Below, we consider in detail the three choices
for $F$, Eqs. (2.13), (2.14) and (2.15), so far considered in the literature.

a) $F$ is the negative of Shannon's entropy $H$ [Eq. (2.13)]. For three or more beams, it is known
that the optimal test, in general, is not a PVM but rather a POVM. For example, for three
states $\hat{n}_1$, $\hat{n}_2$ and $\hat{n}_3$ forming angles of 120° with each other, so that $\sum_{i=1}^{3} \hat{n}_i = 0$, it has
been shown [14] that the optimal test is provided by the following POVM with three elements:

$$ A_i = \frac{1}{3}\,(1 - \hat{n}_i\cdot\vec{\sigma}) . \qquad (4.2) $$

b) $F$ is the negative of Bayes' cost function $C$ [Eq. (2.14)]. Here too, the optimal test is not a
PVM, but a POVM. An example is again provided by the set of three symmetric pure states
considered under case (a) above. It is shown in [13] that the optimal test is given this time
by the following POVM with three elements:

$$ A_i = \frac{1}{3}\,(1 + \hat{n}_i\cdot\vec{\sigma}) . \qquad (4.3) $$

Notice that the above POVM is not the same as that of Eq. (4.2), which is an example of the fact
that the solution of the optimization problem depends on the choice of $F$.
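The different character of the two POVM's of Eqs. (4.2) and (4.3) can be seen by tabulating the
outcome probabilities for the three symmetric states. The sketch below works with Bloch
vectors and uses the identity $\langle\chi_i|(1 + \hat{m}\cdot\vec{\sigma})|\chi_i\rangle = 1 + \hat{m}\cdot\hat{n}_i$. Placing the three
states in the x-z plane and the function names are assumptions made for illustration.

```python
import math

def trine(k):
    """Bloch vector of the k-th symmetric state, assumed to lie in the x-z plane."""
    angle = 2.0 * math.pi * k / 3.0
    return (math.sin(angle), 0.0, math.cos(angle))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def outcome_probabilities(states, povm):
    """P[i][mu] = a_mu (1 + m_mu . n_i) for elements A_mu = a_mu (1 + m_mu . sigma)."""
    return [[a * (1.0 + dot(m, n)) for (a, m) in povm] for n in states]

states = [trine(k) for k in range(3)]
shannon_povm = [(1.0 / 3.0, tuple(-c for c in n)) for n in states]  # Eq. (4.2)
bayes_povm = [(1.0 / 3.0, n) for n in states]                       # Eq. (4.3)

for name, povm in (("Eq. (4.2)", shannon_povm), ("Eq. (4.3)", bayes_povm)):
    P = outcome_probabilities(states, povm)
    print(name)
    for row in P:
        print("  ", [round(p, 4) for p in row])
```

With the POVM of Eq. (4.2) the outcome $\mu$ never occurs for the state $\mu$, so each outcome rules
out one beam; with the POVM of Eq. (4.3) the outcome $\mu$ is the most likely one for the state
$\mu$, so each outcome singles out a most probable beam.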



c) $F$ is given by the rms spread $K$ [Eq. (2.15)]. Remarkably enough, we can show that, for
any number n of equally populated beams, the optimal test is always a PVM. This is in sharp
contrast with what happens for the two other choices of $F$ previously considered. To prove
this claim, consider an optimal POVM, $A = \{A_\mu;\ \mu = 1, \dots, N\}$. We know, from Sec. 2, that
the operators $A_\mu$ must be of the form (2.16). Using Eq. (4.1), we can write:

$$ A_\mu = a_\mu (1 + \hat{m}_\mu\cdot\vec{\sigma}) , \qquad (4.4) $$

where $\hat{m}_\mu$ are $N$ unit three-vectors, and $a_\mu$ are $N$ positive numbers. The condition for a
POVM, $\sum_\mu A_\mu = 1$, is then equivalent to:

$$ \sum_\mu a_\mu = 1 , \qquad \sum_\mu a_\mu \hat{m}_\mu = 0 . \qquad (4.5) $$

In view of Eq. (4.4), we find:

$$ P_{i\mu} := \langle\chi_i|A_\mu|\chi_i\rangle = a_\mu (1 + \hat{m}_\mu\cdot\hat{n}_i) . \qquad (4.6) $$

Using this equation, we compute Eq. (2.10) as:

$$ q_\mu = a_\mu \Big( 1 + \hat{m}_\mu\cdot\sum_i \zeta_i \hat{n}_i \Big) . \qquad (4.7) $$

In order to evaluate the average information $F(A)$ of $A$, it is convenient to rewrite the quantities
$q_\mu K_\mu$ as

$$ q_\mu K_\mu = \left[ \frac{n}{n-1} \sum_{i=1}^{n} \left( \zeta_i P_{i\mu} - \frac{q_\mu}{n} \right)^2 \right]^{1/2} . \qquad (4.8) $$

Upon using Eqs. (4.6) and (4.7) in the above formula, we obtain, after a little algebra:

$$ q_\mu K_\mu = a_\mu \sqrt{\frac{n}{n-1}} \left\{ -\frac{1}{n}\Big[ 1 + \Big(\hat{m}_\mu\cdot\sum_i \zeta_i \hat{n}_i\Big)^2 \Big] + \sum_i \zeta_i^2 \big[ 1 + (\hat{m}_\mu\cdot\hat{n}_i)^2 \big] + 2\,\hat{m}_\mu\cdot\sum_i \zeta_i \Big( \zeta_i - \frac{1}{n} \Big) \hat{n}_i \right\}^{1/2} . \qquad (4.9) $$

We observe now that, for equally populated beams, $\zeta_i = 1/n$, the last sum in the above equation
vanishes, and the expression for $q_\mu K_\mu$ becomes invariant under the exchange of $\hat{m}_\mu$ with $-\hat{m}_\mu$.
Consider now the POVM $B = \{B_\mu^+, B_\mu^-;\ \mu = 1, \dots, N\}$, consisting of $2N$ elements, such that:

$$ B_\mu^+ = \frac{1}{2} A_\mu , \qquad B_\mu^- = \frac{1}{2}\, a_\mu (1 - \hat{m}_\mu\cdot\vec{\sigma}) . \qquad (4.10) $$

Of course, $q_\mu^+ K_\mu^+ = q_\mu K_\mu / 2$, while the invariance of $q_\mu K_\mu$ implies $q_\mu^- K_\mu^- = q_\mu^+ K_\mu^+$.
It follows that the average informations for $A$ and $B$ are equal to each other, $F(A) = F(B)$.
Now, for each value of $\mu$, the pair of operators $B_\mu^\pm / a_\mu = (1 \pm \hat{m}_\mu\cdot\vec{\sigma})/2$ constitutes a PVM,
and thus the POVM $B$ can be regarded as a collection of $N$ PVM's, each taken with a
non-negative weight $a_\mu$. But then $F(B)$, being equal to the weighted average of the amounts of information
provided by $N$ PVM's, cannot be higher than the maximum information $F_P$ delivered by a
PVM. Therefore, we have proven that $F(A) = F(B) \le F_P$, which shows that the optimum
measurement can always be effected by means of a PVM.

We then see that, in the multibeam case, only with Dürr's measure of information can one
dispense with the ancilla, at least for equally populated beams.
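The steps of the argument above can be checked numerically on an example. The sketch below is
an illustration under stated assumptions, not a substitute for the proof: it takes three
symmetric, equally populated states in the x-z plane, evaluates the average rms spread for
the three-element POVM of Eq. (4.3), for its symmetrized version of Eq. (4.10), and for a coarse
scan of PVM's whose axis lies in the plane of the states. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def durr_information(states, povm):
    """Average rms spread, sum over mu of q_mu K_mu, for equally populated beams.

    states -- Bloch vectors n_i of the detector states (zeta_i = 1/n assumed)
    povm   -- list of (a_mu, m_mu) standing for A_mu = a_mu (1 + m_mu . sigma)
    """
    n = len(states)
    total = 0.0
    for a, m in povm:
        joint = [a * (1.0 + dot(m, s)) / n for s in states]   # zeta_i P_{i mu}, Eq. (4.6)
        q = sum(joint)                                           # q_mu, Eq. (4.7)
        total += math.sqrt(n / (n - 1) * sum((j - q / n) ** 2 for j in joint))  # Eq. (4.8)
    return total

def symmetrized(povm):
    """The POVM B of Eq. (4.10): every element is split into a +m and a -m half."""
    plus = [(a / 2.0, m) for a, m in povm]
    minus = [(a / 2.0, tuple(-c for c in m)) for a, m in povm]
    return plus + minus

def pvm(direction):
    """Projective measurement with elements (1 +/- m . sigma)/2."""
    return [(0.5, direction), (0.5, tuple(-c for c in direction))]

# Assumed example: three symmetric states in the x-z plane and the POVM of Eq. (4.3).
angles = [2.0 * math.pi * k / 3.0 for k in range(3)]
states = [(math.sin(t), 0.0, math.cos(t)) for t in angles]
povm_a = [(1.0 / 3.0, s) for s in states]

f_a = durr_information(states, povm_a)
f_b = durr_information(states, symmetrized(povm_a))
# Coarse scan over PVM directions lying in the plane of the states.
f_p = max(durr_information(states, pvm((math.sin(t), 0.0, math.cos(t))))
          for t in (math.pi * j / 180.0 for j in range(180)))
print(f"F(A) = {f_a:.6f}, F(B) = {f_b:.6f}, best scanned PVM = {f_p:.6f}")
```

In this example the three-element POVM, its symmetrized version and the best scanned PVM all
deliver the same average rms spread, so the ancilla brings no advantage for this criterion.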



5 Conclusions 

When, in an interference experiment, the which-way detector states are not mutually orthogo- 
nal, one has an incomplete knowledge of the path followed by the interfering particles. One is 
then faced with the problem of reading out, in an optimum way, the information stored in the 
detectors. The best measurement to be performed depends, in a crucial way, on the criterion 
used to measure the information. This is a problem in quantum decision theory, and our paper 
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is a contribution to the task of identifying the optimum quantum test, for which no general 
solution is known so far. 



We have shown that for the two beams case, both by using Shannon entropy or Bayes cost 
function as measures of information, the best test to be performed is given by an ordinary 
projection valued measurement in the detector's Hilbert space. Actually, it turns out that 
both criteria identify the same measurement. In the multibeam case only Diirr's normalized 
rms spread criterion leads to a PVM, while the other two lead to a POVM. Notice that in the 
case of three coplanar symmetric beam states one ends up with two different POVM's: the one 



relative to Bayes cost [Eq. (4.3)], allows every time to pick one beam as the most probable one, 
while the POVM determined by Shannon entropy, allows to exclude one of the three beams as 
impossible. 

We see, then, that in the multibeam case Diirr's criterion seems to be favoured for two 
different reason. First of all, it allows to derive a quantitative complementarity relation, as 
the one given by Eq. ( |1.3[ ). Second, it allows to work with ordinary quantum mechanical 
measurements, and to ignore generalized POVM's, involving an ancillary system. A possible 
relationship of these two features seams worth studying. This may be related to the fact that, 
as has been recently shown [ fTifl , there are problems in extending the mathematical definition 
of complementarity to a POVM. 

Our results are of limited generality in two respects: first, in the multibeam case they
refer to two-state detectors; second, we always considered equally populated beams. As for
the latter limitation, we have gathered substantial numerical evidence that our results may
extend to arbitrarily populated beams, but at the moment we lack an analytic proof. The former
limitation seems more difficult to overcome. Fortunately, however, the case we have treated is
physically interesting, for it includes many experimental setups in which the "which-way"
detection exploits some two-state internal degree of freedom of the interfering particles.

7 Appendix

In this Appendix, we prove the following

Theorem: for a two-beam interferometer with equally populated beams, when one uses the
negative of Shannon's entropy to measure the which-way information, the optimal measurement
is provided by a PVM [precisely described in Eq. (7.12) below].



More precisely, let $|\chi_+\rangle$ and $|\chi_-\rangle$ be the detector states for the two beams.
We exclude the trivial case in which $|\chi_+\rangle$ and $|\chi_-\rangle$ are proportional, because
then no path reconstruction would be possible. Therefore, the detector's Hilbert space
$\mathcal{H}_D$ is two-dimensional and we can represent vectors in $\mathcal{H}_D$ by unit
three-vectors, according to Eq. (4.1). We lose no generality if we assume that the unit vectors
$\hat n_+$ and $\hat n_-$, associated with $|\chi_+\rangle$ and $|\chi_-\rangle$ respectively, have
the expressions:

$$\hat n_+ = (\sin\theta,\ 0,\ \cos\theta), \qquad \hat n_- = (-\sin\theta,\ 0,\ \cos\theta). \qquad (7.11)$$

With this parametrization for the states $|\chi_+\rangle$ and $|\chi_-\rangle$, our theorem states
that, if the which-way information is measured by the negative of Shannon's entropy $H$, the
optimal measurement is provided by the PVM $A$ with elements:

$$A_+ = \tfrac{1}{2}(1 + \sigma_x), \qquad A_- = \tfrac{1}{2}(1 - \sigma_x). \qquad (7.12)$$
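As a quick numerical illustration of the parametrization (7.11) and of the PVM (7.12), the following sketch builds the 2x2 matrices explicitly and checks the outcome probabilities. It assumes that Eq. (4.1) is the usual Bloch representation $\rho = \tfrac{1}{2}(1 + \hat n\cdot\vec\sigma)$ of a pure state; the helper names (`bloch`, `mat_mul`, `trace`) and the value of `theta` are illustrative choices, not taken from the text.

```python
import math

# Pauli matrices needed for vectors in the xz plane (sigma_y never appears).
I2 = [[1.0, 0.0], [0.0, 1.0]]
SX = [[0.0, 1.0], [1.0, 0.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]

def bloch(nx, nz, weight=0.5):
    """Return weight * (1 + nx*sigma_x + nz*sigma_z) as a 2x2 real matrix."""
    return [[weight * (I2[i][j] + nx * SX[i][j] + nz * SZ[i][j])
             for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

theta = 0.7  # illustrative value, not from the paper
rho_plus = bloch(math.sin(theta), math.cos(theta))    # assumed Eq. (4.1) for n_+
rho_minus = bloch(-math.sin(theta), math.cos(theta))  # assumed Eq. (4.1) for n_-
A_plus, A_minus = bloch(1.0, 0.0), bloch(-1.0, 0.0)   # the PVM of Eq. (7.12)

# Probability of the outcome "+" when the detector is in chi_+ or in chi_-.
p_plus_given_plus = trace(mat_mul(rho_plus, A_plus))
p_plus_given_minus = trace(mat_mul(rho_minus, A_plus))
print(p_plus_given_plus, p_plus_given_minus)
```

Under these assumptions the two conditional probabilities are $\tfrac{1}{2}(1 \pm \sin\theta)$, so the outcome of $A$ identifies the beam correctly with probability $\tfrac{1}{2}(1 + \sin\theta)$.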

Before giving the proof of this Theorem, it is useful to prove first the following 

Lemma: consider, in $\mathbb{C}^2$, $n$ states $|\chi_i\rangle$, with coplanar vectors $\hat n_i$
and arbitrary populations $\zeta_i$. Then the optimal POVM has elements of the form given in
Eq. (4.4), with all the vectors $\hat m_\mu$ lying in the plane containing the vectors $\hat n_i$.

The proof of the lemma is as follows. Let $B$ be an optimal POVM. Then we know, from
the theorems quoted in Sec. 2, that its elements must have rank one and so are of the form
given in Eq. (4.4). Moreover, they must satisfy the POVM conditions given by Eqs. (4.5).
Suppose now that some of the vectors $\hat m^{(B)}_\mu$ do not belong to the plane containing the
vectors $\hat n_i$, which we assume to be the $xz$ plane. We show below how to construct a new POVM
$A = \{A_\nu,\ \nu = 1, \dots, N+p\}$, providing not less information than $B$, and such that the
vectors $\hat m^{(A)}_\nu$ all belong to the $xz$ plane (here $N$ is the number of elements of $B$,
$p$ of which do not lie in the $xz$ plane). The first step in the construction of $A$ consists in
symmetrizing $B$ with respect to the $xz$ plane. The symmetrization is done by replacing each
element $B_\mu$ of $B$ not lying in the $xz$ plane by the pair $(B'_\mu, B''_\mu)$, where
$B'_\mu = B_\mu/2$, and $B''_\mu$ has the same weight as $B'_\mu$, while its vector $\hat m''_\mu$
is the mirror image of $\hat m'_\mu$ with respect to the $xz$ plane. It is easy to verify that the
symmetrization preserves the conditions for a POVM [Eqs. (4.5)]. Since all the vectors $\hat n_i$
belong by assumption to the $xz$ plane, we see, from Eq. (4.6), that the probabilities $P_{i\mu}$
actually depend only on the projections of the vectors $\hat m_\mu$ onto the $xz$ plane. This
implies, as is easy to check, that symmetrization with respect to the $xz$ plane does not change
the information $F$. We assume therefore that $B$ has been preliminarily symmetrized in this way,
and denote by $\alpha^{(B)}_\mu$ and $\hat m^{(B)}_\mu$ the common weight of each symmetric pair and
the vector of $B'_\mu$. Now we show that we can replace, one after the other, each pair of
symmetric elements $(B'_\mu, B''_\mu)$ by another pair of operators, whose vectors lie in the $xz$
plane, without reducing the information provided by the POVM. Consider for example the pair
$(B'_\mu, B''_\mu)$. We construct the unique pair of unit vectors $\hat u_\mu$ and $\hat v_\mu$,
lying in the $xz$ plane, such that:

$$\hat u_\mu + \hat v_\mu = 2\left(m^{(B)x}_\mu\,\hat\imath + m^{(B)z}_\mu\,\hat k\right), \qquad (7.13)$$

where $\hat\imath$ and $\hat k$ are the directions of the $x$ and $z$ axes, respectively. Notice that
$\hat u_\mu \neq \hat v_\mu$. Consider now the collection of operators obtained by replacing the
pair $(B'_\mu, B''_\mu)$ with the pair $(A'_\mu, A''_\mu)$ such that:

$$A'_\mu = \alpha^{(B)}_\mu\,(1 + \hat u_\mu\cdot\vec\sigma), \qquad A''_\mu = \alpha^{(B)}_\mu\,(1 + \hat v_\mu\cdot\vec\sigma). \qquad (7.14)$$

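It is clear, in view of Eqs. (7.13), that the new collection of $N+p$ operators still forms a
resolution of the identity, and thus represents a POVM. Equations (7.13) also imply:

$$\begin{aligned}
P^{(B)\prime}_{i\mu} = P^{(B)\prime\prime}_{i\mu} &= \alpha^{(B)}_\mu\left(1 + m^{(B)x}_\mu n^x_i + m^{(B)z}_\mu n^z_i\right) \\
&= \tfrac{1}{2}\alpha^{(B)}_\mu\left(1 + u^x_\mu n^x_i + u^z_\mu n^z_i\right) + \tfrac{1}{2}\alpha^{(B)}_\mu\left(1 + v^x_\mu n^x_i + v^z_\mu n^z_i\right) = \tfrac{1}{2}\left(P^{(A)\prime}_{i\mu} + P^{(A)\prime\prime}_{i\mu}\right).
\end{aligned} \qquad (7.15)$$

Now define $\lambda'_\mu := q^{(A)\prime}_\mu/(2q^{(B)}_\mu)$ and
$\lambda''_\mu := q^{(A)\prime\prime}_\mu/(2q^{(B)}_\mu)$, where
$q^{(B)}_\mu := q^{(B)\prime}_\mu = q^{(B)\prime\prime}_\mu$. Since
$q^{(A)\prime}_\mu + q^{(A)\prime\prime}_\mu = 2q^{(B)}_\mu$, we have $\lambda'_\mu + \lambda''_\mu = 1$.
It is easy to verify, using Eqs. (2.9) and (2.10), that:

$$Q^{(B)\prime}_{i\mu} = Q^{(B)\prime\prime}_{i\mu} = \lambda'_\mu\, Q^{(A)\prime}_{i\mu} + \lambda''_\mu\, Q^{(A)\prime\prime}_{i\mu}. \qquad (7.16)$$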
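But then, the convexity of $F$ implies (here $Q_\mu$ stands for the collection of the $Q_{i\mu}$,
$i = 1, \dots, n$):

$$\begin{aligned}
q^{(B)\prime}_\mu F(Q^{(B)\prime}_\mu) + q^{(B)\prime\prime}_\mu F(Q^{(B)\prime\prime}_\mu) &= 2q^{(B)}_\mu F(Q^{(B)\prime}_\mu) \\
&= 2q^{(B)}_\mu F\!\left(\lambda'_\mu Q^{(A)\prime}_\mu + \lambda''_\mu Q^{(A)\prime\prime}_\mu\right) \le 2q^{(B)}_\mu\left[\lambda'_\mu F(Q^{(A)\prime}_\mu) + \lambda''_\mu F(Q^{(A)\prime\prime}_\mu)\right] \\
&= q^{(A)\prime}_\mu F(Q^{(A)\prime}_\mu) + q^{(A)\prime\prime}_\mu F(Q^{(A)\prime\prime}_\mu).
\end{aligned} \qquad (7.17)$$

It follows that the new POVM is no worse than $B$. By repeating this construction $p$ times, we
eliminate from $B$ all the $p$ pairs of elements not lying in the $xz$ plane, and obtain a POVM
$A$ which provides not less information than $B$ and whose elements all lie in the $xz$
plane. This concludes the proof of the lemma.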
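The key step of the lemma, replacing a mirror-symmetric pair by an in-plane pair, can be checked numerically. The sketch below assumes POVM elements of the form $\alpha(1 + \hat m\cdot\vec\sigma)$ and conditional probabilities $\alpha(1 + \hat m\cdot\hat n)$ as the content of Eqs. (4.4) and (4.6); the function names and the numerical values are hypothetical.

```python
import math

def in_plane_pair(m):
    """Unit vectors u, v in the xz plane with u + v = 2*(m_x, 0, m_z), cf. Eq. (7.13)."""
    mx, _, mz = m
    r = math.hypot(mx, mz)                 # length of the projection of m onto the xz plane
    if r == 0.0:                           # m along y: the sum vanishes, pick an antipodal pair
        return (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)
    ex, ez = mx / r, mz / r                # unit vector along the projection
    h = math.sqrt(max(0.0, 1.0 - r * r))   # in-plane component orthogonal to the projection
    u = (r * ex - h * ez, 0.0, r * ez + h * ex)
    v = (r * ex + h * ez, 0.0, r * ez - h * ex)
    return u, v

def cond_prob(alpha, m, n):
    """alpha * (1 + m.n): assumed form of Eq. (4.6) for an element alpha*(1 + m.sigma)."""
    return alpha * (1.0 + sum(a * b for a, b in zip(m, n)))

theta = 0.7                                  # illustrative angle of the detector state
n = (math.sin(theta), 0.0, math.cos(theta))  # an in-plane detector-state vector
alpha = 0.1                                  # common weight of the symmetric pair (B', B'')
m1 = (0.36, 0.8, 0.48)                       # out-of-plane unit vector of B'
m2 = (0.36, -0.8, 0.48)                      # its mirror image, the vector of B''
u, v = in_plane_pair(m1)

p_B = cond_prob(alpha, m1, n)                # same value for m2, since n has no y component
p_A = 0.5 * (cond_prob(alpha, u, n) + cond_prob(alpha, v, n))
print(p_B, p_A)                              # the two numbers agree, as in Eq. (7.15)
```

Because the sum of the weighted vectors is unchanged, the POVM conditions survive the replacement, and the probability identity printed above is what feeds the convexity argument of Eq. (7.17).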

We can now turn to the proof of the Theorem stated at the beginning of this Appendix.
The proof consists in showing that the PVM $A$ of Eq. (7.12) provides not less information
than any other POVM $C$ consisting of more than two elements. By virtue of the lemma just
proven, we lose no generality if we assume that the $N > 2$ vectors $\hat m^{(C)}_\mu$ of $C$ lie
in the $xz$ plane. Our first move is to symmetrize $C$ with respect to the $z$ axis, by introducing
a POVM $B$ consisting of $N$ pairs of elements $(B'_\mu, B''_\mu)$, having equal weights, and
vectors $\hat m'_\mu$ and $\hat m''_\mu$ that are symmetric with respect to the $z$ axis:

$$B'_\mu = \tfrac{1}{2} C_\mu, \qquad B''_\mu = \tfrac{1}{2}\alpha^{(C)}_\mu\left(1 - m^{(C)x}_\mu \sigma_x + m^{(C)z}_\mu \sigma_z\right), \qquad \mu = 1, \dots, N, \qquad (7.18)$$

so that both elements of the pair have weight $\alpha^{(B)}_\mu = \alpha^{(C)}_\mu/2$.
$B$ provides as much information as $C$. Indeed, in view of Eq. (4.6), we find

$$P^{(C)}_{\pm\mu} = 2P^{(B)\prime}_{\pm\mu} = 2P^{(B)\prime\prime}_{\mp\mu}, \qquad \mu = 1, \dots, N. \qquad (7.19)$$

The invariance of $F$ with respect to permutations of its arguments then ensures that
$F(B) = F(C)$. Thus, we lose no information if we consider a POVM $B$ that is symmetric with
respect to the $z$ axis. Now we describe a reduction procedure that, applied to a symmetric POVM
like $B$, gives rise to another symmetric POVM $\tilde B$, which contains two elements fewer than
$B$, but nevertheless gives no less information than $B$. The procedure works as follows: we pick
at will two pairs of elements of $B$, say $(B'_N, B''_N)$ and $(B'_{N-1}, B''_{N-1})$, and consider
the unique pair of symmetric unit vectors $\hat u_\pm = \pm u^x\,\hat\imath + u^z\,\hat k$ such that:

$$u^z = \frac{1}{\alpha^{(B)}_N + \alpha^{(B)}_{N-1}}\left[\alpha^{(B)}_N\, m^{(B)z}_N + \alpha^{(B)}_{N-1}\, m^{(B)z}_{N-1}\right]. \qquad (7.20)$$



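Consider the symmetric collection $\tilde B$, obtained from $B$ by replacing the four elements
$(B'_N, B''_N, B'_{N-1}, B''_{N-1})$ with the pair $(\tilde B'_{N-1}, \tilde B''_{N-1})$ such that:

$$\tilde B'_{N-1} = \left(\alpha^{(B)}_N + \alpha^{(B)}_{N-1}\right)(1 + \hat u_+\cdot\vec\sigma), \qquad \tilde B''_{N-1} = \left(\alpha^{(B)}_N + \alpha^{(B)}_{N-1}\right)(1 + \hat u_-\cdot\vec\sigma). \qquad (7.21)$$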



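$\tilde B$ is still a POVM, as is easy to verify. Moreover, $\tilde B$ provides not less
information than $B$, as we now show. Indeed, after some algebra, one finds:

$$\frac{F(\tilde B) - F(B)}{2\left(\alpha^{(B)}_N + \alpha^{(B)}_{N-1}\right)} = g(u^z) - \frac{\alpha^{(B)}_N}{\alpha^{(B)}_N + \alpha^{(B)}_{N-1}}\, g\!\left(m^{(B)z}_N\right) - \frac{\alpha^{(B)}_{N-1}}{\alpha^{(B)}_N + \alpha^{(B)}_{N-1}}\, g\!\left(m^{(B)z}_{N-1}\right), \qquad (7.22)$$

where the function $g(x)$ has the expression:

$$\begin{aligned}
g(x) = {} & -(1 + x\cos\theta)\log(1 + x\cos\theta) \\
& + \tfrac{1}{2}\left(1 + x\cos\theta + (1 - x^2)^{1/2}\sin\theta\right)\log\left[\tfrac{1}{2}\left(1 + x\cos\theta + (1 - x^2)^{1/2}\sin\theta\right)\right] \\
& + \tfrac{1}{2}\left(1 + x\cos\theta - (1 - x^2)^{1/2}\sin\theta\right)\log\left[\tfrac{1}{2}\left(1 + x\cos\theta - (1 - x^2)^{1/2}\sin\theta\right)\right].
\end{aligned} \qquad (7.23)$$

In view of Eq. (7.20), the r.h.s. of Eq. (7.22) is of the form

$$g(\lambda x_1 + (1 - \lambda)x_2) - \lambda\, g(x_1) - (1 - \lambda)\, g(x_2), \qquad (7.24)$$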



where $\lambda = \alpha^{(B)}_N/(\alpha^{(B)}_N + \alpha^{(B)}_{N-1})$, while $x_1 = m^{(B)z}_N$ and
$x_2 = m^{(B)z}_{N-1}$. It may be checked that, for all values of $\theta$, $g(x)$ is concave for
$x \in [-1, 1]$, and so the expression (7.24) is non-negative for any value of
$\lambda \in [0, 1]$. This implies that the r.h.s. of Eq. (7.22) is non-negative as well,
and so $F(\tilde B) \ge F(B)$. After $N - 2$ iterations of this procedure, we end up with a
symmetric POVM consisting of two pairs of elements $(B'_1, B''_1)$ and $(B'_2, B''_2)$. But then
the conditions for a POVM, Eqs. (4.5), imply that the quantity between the brackets on the r.h.s.
of Eq. (7.20) vanishes, and so Eq. (7.20) gives $u^z = 0$. This means that the last iteration gives
rise precisely to the PVM $A$ of Eq. (7.12). Putting everything together, we have shown that
$F(C) = F(B) \le F(\tilde B) \le \dots \le F(A)$, and this is the required result.
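The reduction argument can be followed numerically on a small example. The sketch below is only an illustration under stated assumptions: it takes $g$ to be the concave function $g(x) = a_+\log a_+ + a_-\log a_- - (a_+ + a_-)\log(a_+ + a_-)$ with $a_\pm = \tfrac{1}{2}\left(1 + x\cos\theta \pm (1 - x^2)^{1/2}\sin\theta\right)$, which is how Eq. (7.23) is read here, and it assumes that a $z$-symmetric pair of weight $\alpha$ contributes $2\alpha\, g(m^z)$ to $F$. The POVM `B`, the value of `theta`, and all function names are hypothetical.

```python
import math

def xlogx(t):
    """t*log(t), extended by continuity to t = 0."""
    return t * math.log(t) if t > 0.0 else 0.0

def g(x, theta):
    """Assumed reading of Eq. (7.23) for an in-plane element with m_z = x."""
    s, c = math.sin(theta), math.cos(theta)
    r = math.sqrt(max(0.0, 1.0 - x * x))
    a_plus = 0.5 * (1.0 + x * c + r * s)
    a_minus = 0.5 * (1.0 + x * c - r * s)
    return xlogx(a_plus) + xlogx(a_minus) - xlogx(1.0 + x * c)

def info(pairs, theta):
    """F of a z-symmetric in-plane POVM given as [(alpha, m_z), ...], one entry per pair."""
    return sum(2.0 * alpha * g(mz, theta) for alpha, mz in pairs)

def reduce_last_two(pairs):
    """One reduction step, Eqs. (7.20)-(7.21): merge the last two pairs into one."""
    (a1, z1), (a2, z2) = pairs[-2], pairs[-1]
    return pairs[:-2] + [(a1 + a2, (a1 * z1 + a2 * z2) / (a1 + a2))]

theta = 0.7  # illustrative value
# Hypothetical symmetric POVM with three pairs: 2*sum(alpha) = 1 and sum(alpha*m_z) = 0.
B = [(0.2, 0.5), (0.2, -0.3), (0.1, -0.4)]

chain = [B]
while len(chain[-1]) > 1:
    chain.append(reduce_last_two(chain[-1]))
values = [info(p, theta) for p in chain]
print(values)  # non-decreasing; the last entry corresponds to the PVM of Eq. (7.12)
```

In this example the last reduction yields $u^z = 0$ up to rounding, so the final value coincides with $g(0)$, the information provided by the PVM $A$.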
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